Big Data is no longer under active development.

The Syncfusion Big Data Platform

The Syncfusion Big Data Platform is the first and the only complete Hadoop distribution designed for Windows and Linux. Develop on Windows using familiar tools and deploy on both Windows and Linux.

In today's data-driven world, managing large quantities of data has become an important ingredient in business success. Managing big data comes with several challenges, including cost-effective storage and querying across structured and unstructured data. It also offers tremendous potential. Imagine not having to decide which of your data is valuable. Imagine being able to store any amount of data using commodity hardware with linear scalability. The Hadoop environment makes all of this, and much more, possible today.

And now, Syncfusion has made these powerful technologies available on both Windows and Linux. With the Syncfusion Big Data Platform, you have complete access to the Hadoop environment. By adopting our platform, you are using an industry-tested solution currently employed by companies such as Microsoft, Facebook, Amazon, Adobe, Hulu, LinkedIn, and Yahoo.

Enterprise-grade Hadoop cluster

The Syncfusion Big Data Platform includes a complete production environment that can run Hadoop jobs in a scalable manner on a full cluster. The included cross-platform Cluster Manager application makes provisioning easy, allowing you to manage and monitor multi-node Hadoop clusters on both Windows and Linux.

Easily create production Hadoop clusters on Windows and Linux

With the Syncfusion Hadoop distribution, you can create clusters in minutes using commodity machines running Windows (Windows 7/Windows Server 2008 and later) or Linux (Ubuntu, CentOS).

Manage and monitor multiple clusters

Our cluster management system comes with built-in support for managing and monitoring multiple clusters. All you need is a web browser.

Easily monitor job status

The included monitoring system provides information on the general health of your cluster. It also provides detailed metrics on nodes that are part of the cluster and jobs running on the cluster.

Hadoop cluster upgrade

You can easily perform a Hadoop rolling upgrade to move your cluster from an older version to a newer one without incurring data loss or downtime in accessing HDFS. You can also upgrade the SDK packages that are shipped with Hadoop.

Integrated support for PySpark and Scientific Python

The platform offers cleanly integrated support for the scientific Python stack, including integrated Python package management. Commonly used packages are bundled with an easy-to-use interface for updating and adding packages as needed. Visualize large amounts of data using Spark and the rich visualization features available as part of the included IPython stack.

Caching - Spark SQL

Spark excels in processing in-memory data. It supports loading data into a cluster-wide in-memory cache to deliver the best performance in data visualization and retrieval. With Cluster Manager, you can cache tables in the Spark Thrift Server with a few clicks.

Prepare and manage Oozie jobs

Oozie is a workflow scheduler system for managing Hadoop jobs. The Syncfusion Big Data Platform makes job submission and monitoring easier by providing a user-friendly interface for submitting jobs and a dashboard for monitoring them.

Prepare and run Sqoop jobs

Sqoop helps you transfer data from relational databases into HDFS. The Syncfusion Big Data Platform makes the job even easier by providing a user-friendly interface to Sqoop.

Azure automation

You can easily create, deploy, and scale a secure Syncfusion Hadoop cluster with basic or Kerberos-enabled authentication in a Microsoft Azure Virtual Machines environment in minutes. The Syncfusion Big Data Cluster Manager allows you to effectively manage the resources in Microsoft Azure with options to track billing details and shut down, restart, and destroy the virtual machines as required. You can also start and stop the virtual machines with the Hadoop cluster at scheduled intervals. Azure-based Hadoop clusters also support Azure Blob storage as the default file system, which lets you easily scale storage and switch between hot and cool access tiers.

Syncfusion Big Data Platform Sandbox

Syncfusion has published a sandbox Azure image containing a pseudo-node Hadoop cluster. You can easily provision the image from the Azure portal and start exploring the pseudo-node Hadoop cluster once the VM is up and running. You can also connect to it directly from Big Data Studio, without a VPN, and submit jobs.

Create secure Hadoop cluster

You can easily create a Kerberos-enabled secure Hadoop cluster on both Windows and Linux within minutes. The Syncfusion Big Data Cluster Manager facilitates seamless integration with Active Directory Server. You can easily manage users' access to HDFS, Hive, and HBase.



Syncfusion Big Data Studio

The Syncfusion Big Data Studio provides an easy-to-use environment to work with popular big data tools such as Spark, HBase, Pig and Hive. It also provides direct access to the Hadoop Distributed File System, HDFS. The Big Data Studio ships with a local install of the Syncfusion Big Data SDK, which provides a complete working Hadoop distribution right on your laptop. No virtual machines are needed, so there is no need to juggle between Linux and Windows. You don’t even have to be connected to a cluster to work on Hadoop jobs. You can work with Hadoop on your Windows machine, even when offline, and then deploy to a cluster for production when you are ready.

Interactive HDFS explorer

Syncfusion Big Data Studio includes a full-fledged explorer-UI that allows for easy interaction with files stored within HDFS.

Work interactively with Pig, Hive, Spark and HBase

Work interactively with Pig, Hive, HBase and Spark (Scala, Python, IPython and Spark SQL). Syncfusion Big Data Studio provides an interactive command-line interface and a rich editor for each of these tools. Use the power of a read-eval-print loop to get work done quickly.

Prepare and run Sqoop jobs

Sqoop helps you transfer data from relational databases into HDFS. The Syncfusion Big Data Platform makes the job even easier by providing a user-friendly interface to Sqoop.

Submit jobs in secure Hadoop cluster

You can easily add a Kerberos-enabled secure Hadoop cluster and submit Hadoop, Sqoop, Pig, Hive, Spark, and HBase jobs from within Big Data Studio.

Support to connect with Azure Cluster

You can connect directly to a Syncfusion Hadoop cluster running in Azure without the need for a VPN connection. Traffic is sent over SSL and the connection is authenticated.




Why Syncfusion

Frequently Asked Questions

Licensing

Can the entire product, including the production cluster, be used commercially?

Yes, if you qualify for a community license or if you obtain a commercial license.

Support

What are the support options that are available?

Forum support is available to everyone for free. Commercial support under a defined SLA is available for an annual fee. Details of the fees are given in the following FAQ.

Are there limits to the number of incidents that can be submitted?

No. As a matter of policy, Syncfusion does not normally limit the number of support incidents for those under commercial support.

How can I submit feature requests?

You can log feature requests through our Direct-Trac customer-service portal.

Where do I report bugs?

You can report bugs using our Direct-Trac customer-service portal.

Does Syncfusion offer paid consulting services in the big data domain?

Yes. Please contact us for additional information.

Benefits for Syncfusion customers

Are there special benefits for Syncfusion customers?

Yes. If you are a Syncfusion Plus member, you, as a named user, will automatically receive commercial support for one cluster of up to 5 nodes (the cluster limit is a per-organization limit). In addition, current Syncfusion Global Enterprise License holders receive commercial support at the Platinum level at no charge.

I have a Syncfusion Essential Studio Community License. Do I receive access to the Big Data Platform?

Yes. You will receive commercial support on a single cluster for up to 5 nodes.

Software Requirements

What are the requirements to run the Syncfusion Big Data Platform?

Windows 7 or later with .NET Framework 4.5, or Linux (Ubuntu, CentOS). For a production cluster, we recommend deploying on Windows Server 2008 or later, or on Linux (Ubuntu, CentOS).

Do I need to install Cygwin, Python, Java, etc.?

No. We automatically handle all dependencies. Cygwin is not used by the Syncfusion Big Data Platform.

Hardware Requirements

Do you recommend specific types of hardware?

The Syncfusion Big Data Platform runs on a variety of hardware. We recommend you start with new hardware with 16 GB of RAM or more, and one or more disks with sufficient capacity for your needs. RAID is not required for data nodes. RAID is a good idea for name nodes. Specification-wise, you can certainly go as high as you want once your needs expand. You can contact us for specific advice based on your requirements.

As an example, we have a system for gathering and summarizing product metrics running on a small cluster with off-the-shelf, desktop-quality hardware. It has delivered tremendous value over the past few months. We also run other clusters that are much larger. Don't feel the need to scale up when you get started. Start with the basics. One of the nice things is that it is very simple to scale up as you need.

Technical

Can the Syncfusion Big Data Platform be used to configure clusters using virtual machines on cloud servers?

Yes. We test this use-case using Microsoft Azure and other cloud providers.

Is there support for YARN?

Yes. The core version of Apache Hadoop we ship is 2.5.2 or higher, and it comes with complete support for YARN.

Is there complete support for Apache Pig?

Yes.

Is there complete support for Apache Hive?

Yes.

Is HBase supported?

Yes.

Can I connect to data stored on HDFS from my C# applications?

Yes. We ship several samples with the Syncfusion Big Data Studio installation.

Can I use C# to author MapReduce applications?

Yes. We ship several samples with the Syncfusion Big Data Studio install. You can write code or use third-party assemblies as needed.

Can I use Java to author MapReduce applications?

Absolutely. We ship several samples with the Syncfusion Big Data Studio install.

Can I use Python or other languages?

Absolutely. We ship several Python samples with the product.
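
As an illustration only, here is a minimal word-count sketch in the map/reduce style. It is not one of the shipped samples, and it assumes the job would be run through Hadoop Streaming, which passes records to scripts as lines of text; the page does not say which mechanism the bundled Python samples use. The function names `mapper` and `reducer` are hypothetical.

```python
from itertools import groupby


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum the counts for each word.

    The pairs must arrive sorted by word. On a cluster the shuffle
    phase provides that ordering; locally, sorted() stands in for it.
    """
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)
```

To try it locally without a cluster, feed the mapper a list of lines, sort its output, and pass the result to the reducer, for example `dict(reducer(sorted(mapper(lines))))`.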

Do I have to make any additional changes to enable high availability on the cluster?

No. Support for high availability is built-in and works out of the box with Syncfusion Big Data Platform.

Is Syncfusion Big Data Platform compatible with Microsoft HDInsight?

Yes. Our platform is compatible with the Microsoft HDInsight platform. Identical code will run on both platforms. Please note, however, that our interactive studio environment does not support direct connections to the HDInsight platform.

Is there a limit on cluster size?

Not that we know of. From a practical perspective, we test clusters of up to 100 nodes. Custom tuning will almost certainly be required for very large clusters. Contact us for assistance.

Do I need to have DNS configured properly to install Syncfusion Big Data Platform? Can I use IP addresses instead?

You need to have DNS and reverse DNS configured properly to install Syncfusion Big Data Platform. Configuration with IP addresses is not recommended and is not supported by the cluster manager. If you provide IP addresses, we will translate these into host names, assuming DNS is configured as expected.
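
Before installing, it can help to confirm that each node resolves in both directions. The following is a rough sketch of such a check using only the Python standard library; it is not part of the Cluster Manager, and the function name `check_dns` and the host names in the usage note are hypothetical. The resolver functions are passed as parameters so the logic can be exercised without network access.

```python
import socket


def check_dns(host, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ok, detail) for a forward plus reverse DNS round trip.

    ok is True only when host resolves to an IP address and that
    address resolves back to the same host name (or one of its aliases).
    """
    try:
        ip = forward(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"forward lookup failed for {host}: {exc}"
    try:
        name, aliases, _ = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"reverse lookup failed for {ip}: {exc}"
    known = {n.lower() for n in [name, *aliases]}
    if host.lower() in known:
        return True, f"{host} -> {ip} -> {name}"
    return False, f"{host} -> {ip} resolves back to {name}, not {host}"
```

Calling `check_dns("node1.example.local")` for every machine you plan to add, and fixing any that return `False`, is one way to catch DNS problems before they surface during cluster creation.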

Other questions

How does Syncfusion Big Data Platform compare with distributions from other vendors?

Most current distributions focus on the Linux platform alone. They are not easy to set up and maintain on Windows. We aim to provide a solid big data platform tailored for both Windows and Linux. Additionally, the platform offers features such as a powerful cross-platform cluster manager that minimizes the effort required to get started.

About Syncfusion

Who is Syncfusion?

We have been in business since 2001. We are one of the largest providers of software frameworks in the world. We provide frameworks to approximately half of the Fortune 500 companies, and we globally have over half a million users. Some of the best-known software packages in the world are built with our technology under the hood.
